A Learning-Based Approach for Biomedical Word Sense Disambiguation
نویسندگان
چکیده
In the biomedical domain, word sense ambiguity is a widely spread problem with bioinformatics research effort devoted to it being not commensurate and allowing for more development. This paper presents and evaluates a learning-based approach for sense disambiguation within the biomedical domain. The main limitation with supervised methods is the need for a corpus of manually disambiguated instances of the ambiguous words. However, the advances in automatic text annotation and tagging techniques with the help of the plethora of knowledge sources like ontologies and text literature in the biomedical domain will help lessen this limitation. The proposed method utilizes the interaction model (mutual information) between the context words and the senses of the target word to induce reliable learning models for sense disambiguation. The method has been evaluated with the benchmark dataset NLM-WSD with various settings and in biomedical entity species disambiguation. The evaluation results showed that the approach is very competitive and outperforms recently reported results of other published techniques.
منابع مشابه
A Learning Approach for Word Sense Disambiguation in the Biomedical Domain
Word sense disambiguation, WSD, task has been investigated extensively within the natural language processing domain. In the biomedical domain, word sense ambiguity is more widely spread with bioinformatics research effort devoted to it is not commensurate and is allowing for more development. In this paper, we present and evaluate a machine learning based approach for WSD. The main limitation ...
متن کاملWord embeddings and recurrent neural networks based on Long-Short Term Memory nodes in supervised biomedical word sense disambiguation
Word sense disambiguation helps identifying the proper sense of ambiguous words in text. With large terminologies such as the UMLS Metathesaurus ambiguities appear and highly effective disambiguation methods are required. Supervised learning algorithm methods are used as one of the approaches to perform disambiguation. Features extracted from the context of an ambiguous word are used to identif...
متن کاملBiomedical Word Sense Disambiguation with Neural Word and Concept Embeddings
OF THESIS Biomedical Word Sense Disambiguation with Neural Word and Concept Embeddings Addressing ambiguity issues is an important step in natural language processing (NLP) pipelines designed for information extraction and knowledge discovery. This problem is also common in biomedicine where NLP applications have become indispensable to exploit latent information from biomedical literature and ...
متن کاملA Comparative Study of an Unsupervised Word Sense Disambiguation Approach
Word sense disambiguation is the problem of selecting a sense for a word from a set of predefined possibilities. This is a significant problem in the biomedical domain where a single word may be used to describe a gene, protein, or abbreviation. In this paper, we evaluate SENSATIONAL, a novel unsupervised WSD technique, in comparison with two popular learning algorithms: support vector machines...
متن کاملKernel Fuzzy C-Means Clustering for Word Sense Disambiguation in
Word sense disambiguation (WSD) in biomedical texts is important. The majority of existing research primarily focuses on supervised learning methods and knowledge-based approaches. Implementing these methods requires significant human-annotated corpus, which is not easily obtained. In this paper, we developed an unsupervised system for WSD in biomedical texts. First, we predefine the number of ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
دوره 2012 شماره
صفحات -
تاریخ انتشار 2012